(54) System and method for accessing TV-related information over the internet

(57) The system retrieves information from the internet using multiple
search engines that are simultaneously launched by the search engine
commander. The commander is responsive to a speech-enabled system
including a speech recognizer and natural language parser. The user
speaks to the system in natural language requests, and the parser
extracts the semantic content from the user's speech, based on a set of
goal-oriented grammars. The preferred system includes a fixed grammar
and an updatable or downloaded grammar, allowing the system to be used
without extensive training and yet capable of being customized for a
particular user's purposes. Results obtained from the search engines are
filtered based on information extracted from an electronic program guide
and from prestored user profile data. The results may be displayed on
screen or through synthesized speech.

Description

Background and Summary of the Invention

[0001] The present invention relates generally to interactive television
and information retrieval. More particularly, the invention relates to a
speech-enabled system whereby a user's spoken requests for information
are recognized, parsed and supplied to a search engine for retrieving
information pertinent to the user's request.
[0002] The number and variety of TV programs available to viewers is
growing rapidly. Thus viewers require a rapid, user-friendly way of
searching for broadcasts that suit their tastes and needs. Much
information about TV programs is available on various internet sites,
but access to those sites requires logging onto a computer and typing in
key words.

[0003] Ideally, the user would like to be able to obtain information from
Internet sites while he or she is using the television, by making spoken
requests to the television and having it obtain the requested
information. Thus a user could simply tell the television what he or she
wants to see: "Show me any international water polo event", for example,
and the TV would access the Internet to find out when and on what
channel such a program is broadcast. Using the information as
downloaded, the TV would also be able to answer questions about the
broadcast such as "What teams are playing?"
[0004] By way of further example, the user, viewing a particular program
about mountain climbing, might want more information about the tallest
mountain peaks and when they were first climbed. The user would like to
be able to ask the television to find answers to these questions and
then display the results on screen or through synthesized spoken
response.
[0005] Unfortunately, this type of sophisticated interaction with the
television has not been possible. The present invention breaks new
ground in this regard. The invention provides a speech recognition
system with associated language parser that will extract the semantic
content or meaning from a user's spoken command or inquiry, and
formulate a search request suitable for supplying to one or more
internet search engines. The parser contains a reconfigurable grammar by
which it can understand the meaning of a user's spoken request within a
given context. The grammar itself may be reconfigured via the Internet,
based on knowledge of what the user is currently viewing. This knowledge
may be supplied by electronic program guide or as part of the digital
television data stream.

[0006] The results obtained from the search engines may be further
analyzed by the parser, to select the most likely candidates that
respond to the user's original inquiry. These results are then provided
to the user on screen or through synthesized speech, or both.
[0007] For a more complete understanding of the invention, its objects
and advantages, refer to the following specification and to the
accompanying drawings.



Brief Description of the Drawings 



[0008]

Figure 1 is a block diagram of the presently preferred embodiment of the
invention;

Figure 2 is a block diagram depicting the components of the natural
language parser of the presently preferred embodiment of the invention;
and

Figure 3 is a block diagram depicting the components of the local parser
of the presently preferred embodiment of the invention.

Description of the Preferred Embodiment 

[0009] Referring to Figure 1, a presently preferred embodiment of the
speech-enabled information access system comprises a speech recognizer
10 to which input speech is supplied through suitable microphone
interface. In this regard, the microphone can be attached by cable or
coupled through wireless connection to speech recognizer 10. The
microphone may be packaged, for example, within the handheld remote of a
television or other information appliance.

[0010] The output of speech recognizer 10 is coupled to natural language
parser 12. The natural language parser extracts the semantics or meaning
from the spoken words, phrases and sentences supplied by the user. As
will be discussed more fully below, natural language parser 12 works
with a set of pre-defined grammars that are preferably constructed based
on goal-oriented tasks. In the presently preferred embodiment these
grammars may be categorized as one of two types: a fixed grammar 14 and
a downloaded grammar 16.

[0011] The fixed grammar represents a pre-defined set of goal-oriented
tasks that the system is able to perform immediately upon installation.
For example, the fixed grammar would allow the natural language parser
to understand sentences such as "Please find me an international water
polo event."

[0012] Expanding upon the fixed grammar, an optional, downloaded grammar
16 can be added to the system, giving the natural language parser the
ability to understand different classes of sentences not originally
provided for in the original package. These additional downloaded
grammars can be used to expand the capability of the system periodically
(when the system manufacturer develops new enhancements or new features)
or to add third-party enhancements that the user may be particularly
interested in.

[0013] For example, if a particular user is interested in playing chess
interactively with users around the world, the downloaded grammar can be
augmented to include the necessary grammars to give chess move commands
to the system.

[0014] Much of the power underlying the system comes from its ability to
access the rich information content found on the internet. The system
includes a search engine commander 18 which receives semantic
instructions from natural language parser 12. The search engine
commander lies at the hub of a number of information handling processes.
The search engine commander is coupled to the internet connection module
20, which has suitable TCP/IP protocols necessary for communication with
a suitable service provider giving access to the Internet 22. The search
engine commander formulates search requests, based on the user's input
as derived by the natural language parser 12. The commander 18
formulates search requests to be suitable for handing off to one or more
search engines that are maintained by third parties on the internet. In
Figure 1 three search engines are shown at 24. Examples of suitable
search engines include: Yahoo, AltaVista, Excite, Lycos, GoTo, and so
forth. In essence, the search engine commander 18 communicates with all
of the search engines in parallel, sending each of them off on the task
of locating information responsive to the user's spoken inquiry.
[0015] The search engines, in turn, identify information found on the
internet that responds to the user's request. Typically, search engines
of this type return a priority score or probability score indicative of
how likely the retrieved information is responsive to the user's
request. In this regard, different search engines use different
algorithms for determining such probabilities. Thus having the ability
to access multiple search engines in parallel improves the richness of
the information retrieved. In other words, not all search engines will
return the same information for every inquiry made, but the combined
effect of using multiple search engines produces richer results than any
single search engine alone.
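
The parallel hand-off described above can be pictured with a minimal
Python sketch. Everything in it is an assumption made for illustration:
the three engine functions are canned stand-ins rather than real search
services, the URLs are placeholders, and keeping the highest score per
link is only one possible way to combine scores that the engines compute
with different algorithms.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for third-party search engines; a real system would
# issue network requests here instead of returning canned results.
def engine_a(query):
    return [("http://example.com/a1", 0.9), ("http://example.com/shared", 0.4)]

def engine_b(query):
    return [("http://example.com/shared", 0.7), ("http://example.com/b1", 0.5)]

def engine_c(query):
    return [("http://example.com/c1", 0.6)]

def dispatch(query, engines):
    """Send the same query to every engine in parallel and merge the hits."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    best = {}
    for hits in result_lists:
        for url, score in hits:
            # Keep the highest score reported for a link by any engine.
            best[url] = max(score, best.get(url, 0.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)

merged = dispatch("international water polo event",
                  [engine_a, engine_b, engine_c])
```

The merged list contains links that no single engine returned on its
own, which is the richness argument made in paragraph [0015].
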
[0016] The search engines return a list of links (e.g., hypertext links
or URL addresses) that are responsive to the request. Typically, the
returned information is sorted by probability score, so that the sites
most likely to contain relevant information are presented first.
[0017] The returned results are fed back to search engine commander 18,
and search engine commander 18, in turn, passes the results to the
search results processor 26 for filtering. Typically a user of this
system does not want to see every piece of information identified by the
search engines. Rather the user is typically interested in the best one
or two information resources. To filter the results, search results
processor 26 may have optional information filters 28 that are based on
user-defined preferences. These filters help processor 26 determine
which responses are likely to be more interesting to the user and which
responses should be discarded. The presently preferred embodiment
updates these information filters on a per-user basis, based on
historical data gathered as the user makes use of the system.
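
As a rough sketch of how such preference-based filtering might work, the
Python fragment below discards results that match blocked terms, boosts
results that match preferred terms, and keeps the best two. The term
sets, the boost of 0.2 and the sample results are all hypothetical; the
source does not specify how the filters are represented.

```python
def filter_results(results, preferred_terms, blocked_terms, keep=2):
    """Return the `keep` best results after applying user-preference filters."""
    rescored = []
    for url, title, score in results:
        text = title.lower()
        if any(term in text for term in blocked_terms):
            continue  # discard responses matching a blocked term
        # Assumed boost of 0.2 for each preferred term found in the title.
        boost = sum(0.2 for term in preferred_terms if term in text)
        rescored.append((url, title, score + boost))
    rescored.sort(key=lambda item: item[2], reverse=True)
    return rescored[:keep]

results = [
    ("http://example.com/1", "Water polo world cup schedule", 0.6),
    ("http://example.com/2", "Buy pool supplies", 0.8),
    ("http://example.com/3", "Water polo rules", 0.5),
    ("http://example.com/4", "Polo shirts on sale", 0.7),
]
top = filter_results(results,
                     preferred_terms={"water polo"},
                     blocked_terms={"buy", "sale"})
```

Updating the filters per user would then amount to adjusting the two
term sets as historical data accumulates.
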
[0018] A very important item of information in filtering the search
results comes from the knowledge of what the user is currently viewing.
This information is extracted from an electronic program guide, which
may be locally stored as at 30 for access by the search engine
commander. The electronic program guide contains information about each
program that is available for viewing over a pre-defined time interval.
The guide includes the date and time of the program, the program title,
and other useful information such as what category the program falls
into (e.g., comedy, drama, news, sports, etc.), what actors star in the
program, who directed the program, and so forth. Often this information
is relevant in determining what information the user is interested in
retrieving.

[0019] For example, if the user is watching a movie starring Marilyn
Monroe, the user may be interested in learning more about this actress'
life. The user could thus ask the system to "Tell me more about the main
actress' life" and the system would ascertain from the electronic
program guide that the actress is Marilyn Monroe.

[0020] The information contained in the electronic program guide can be
used in multiple ways. The search engine commander can make use of this
information in formulating its requests for information that are sent to
the search engines 24. In addition, when the information is returned by
the search engines, the search engine commander 18 can pass the relevant
electronic program guide data down to the search results processor 26
along with the search results. This allows the search results processor
to use relevant electronic program guide information in filtering the
results obtained.
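
The use of program guide data when formulating a request can be
illustrated with a small Python sketch. The record layout, the field
names and the phrase table are assumptions introduced for this example;
the source only states that the guide identifies items such as the
title, the actors and the director of the program being viewed.

```python
# Hypothetical EPG record for the program currently being viewed.
current_program = {
    "title": "Some Like It Hot",
    "lead_actress": "Marilyn Monroe",
    "director": "Billy Wilder",
}

# Assumed mapping from context-dependent phrases to EPG fields.
REFERENCE_FIELDS = {
    "the main actress": "lead_actress",
    "the director": "director",
    "this movie": "title",
}

def resolve_references(request, program):
    """Replace context-dependent phrases with values from the EPG record."""
    resolved = request
    for phrase, field in REFERENCE_FIELDS.items():
        if phrase in resolved and field in program:
            resolved = resolved.replace(phrase, program[field])
    return resolved

query = resolve_references(
    "tell me more about the life of the main actress", current_program)
```

The resolved text, rather than the literal spoken words, is what would
be handed to the search engines.
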
[0021] Because the electronic program guide changes over time, it is
necessary to update the contents of the electronic program guide data
store 30 on a periodic basis. The search engine commander does this
automatically by accessing the internet. Alternatively, if desired, the
electronic program guide information can be obtained through the
television system's cable or satellite link.

[0022] While the system described above has the ability to access any
information available on the internet, a particularly robust embodiment
can be implemented by designating certain pre-defined sites that contain
information the user has selected as being of interest, or sites
designated by the system manufacturer as containing information of
interest to most viewers. Information retrieved from such pre-designated
sites can be retrieved and communicated to the user more quickly,
because there is no need to invoke search engines to scour the entire
body of information available on the Internet.

[0023] By way of illustration, the system may be pre-configured to access
an on-line encyclopedia internet site which is used to supply commonly
requested information about programs the user is viewing. For example,
if the user is watching a movie about India, the system might
automatically retrieve relevant statistics about that country and
provide them on screen in response to a user's request.

[0024] An interesting enhancement of this capability involves the
presentation of multimedia data or streaming data from the pre-selected
internet web site. By providing streaming data, the user is given the
experience of actually viewing the supplemental material as a film clip
or animation. Such film clips or animations could be viewed, for
example, during commercial breaks. Alternatively, if the user is
enjoying a television system that provides video on demand, the user
could temporarily suspend transmission of the original program to allow
viewing of the supplemental information provided from the pre-defined
Internet site.

[0025] The search engine commander, itself, maintains a user profile data
store 32 that may be used to further enhance the usefulness of the
system. User preferences stored in the user profile data store can be
combined with information in the electronic program guide to generate
search requests automatically. Thus, if the system has ascertained from
previous usage that the viewer is interested in certain international
events, the search engine commander will automatically send requests for
relevant information and can cause the relevant information to be
displayed on the screen, depending on whether such information is
suitable in the current viewing context. For example, if important news
about a viewer's home country is found, it could be displayed on screen
while the international news is being viewed. The same message might be
suppressed if the viewer is watching a movie that may be simultaneously
being recorded.

[0026] The presently preferred embodiment uses a natural language parser
that is goal-oriented. Figure 2 depicts components of the natural
language parser 12 in more detail. In particular, speech understanding
module 128 includes a local parser 160 to identify predetermined
relevant task-related fragments. Speech understanding module 128 also
includes a global parser 162 to extract the overall semantics of the
speaker's request.
[0027] The local parser 160 utilizes in the preferred embodiment small
and multiple grammars along with several passes and a unique scoring
mechanism to provide parse hypotheses. For example, the novel local
parser 160 recognizes according to this approach phrases such as dates,
names of people, and movie categories. If a speaker utters "tell me
about a comedy in which Mel Brooks stars and is shown before January
23rd", the local parser recognizes: "comedy" as being a movie category;
"January 23rd" as a date; and "Mel Brooks" as an actor. The global
parser assembles those items (movie category, date, etc.) together and
recognizes that the speaker wishes to retrieve information about a movie
with certain constraints.
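
The division of labour between the local and global parsers can be
sketched in Python as follows. This is a deliberately simplified
illustration: the regular-expression lexicons stand in for the small
grammars of the source, the goal name find_movie is hypothetical, and
relational words such as "before" are ignored.

```python
import re

# Tiny hypothetical lexicons; a real system would derive these from grammars
# and from program-guide data.
TOPIC_PATTERNS = {
    "category": re.compile(r"\b(comedy|drama|news|sports)\b", re.IGNORECASE),
    "actor": re.compile(r"\b(Mel Brooks|Marilyn Monroe)\b", re.IGNORECASE),
    "date": re.compile(
        r"\b(January|February|March|April|May|June|July|August|September|"
        r"October|November|December) \d{1,2}(st|nd|rd|th)?\b",
        re.IGNORECASE,
    ),
}

def local_parse(sentence):
    """Spot task-related fragments and label each with its topic."""
    fragments = []
    for topic, pattern in TOPIC_PATTERNS.items():
        for match in pattern.finditer(sentence):
            fragments.append((topic, match.group(0), match.span()))
    # Order the fragments by their position in the sentence.
    return sorted(fragments, key=lambda fragment: fragment[2])

def global_parse(fragments):
    """Assemble the fragments into one request with constraints."""
    request = {"goal": "find_movie", "constraints": {}}
    for topic, text, _span in fragments:
        request["constraints"][topic] = text
    return request

sentence = ("tell me about a comedy in which Mel Brooks stars "
            "and is shown before January 23rd")
request = global_parse(local_parse(sentence))
```
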
[0028] Speech understanding module 128 includes knowledge database 163
which encodes the semantics of a domain (i.e., goal to be achieved). In
this sense, knowledge database 163 is preferably a domain-specific
database as depicted by reference numeral 165 and is used by dialog
manager 130 to determine whether a particular action related to
achieving a predetermined goal is possible.

[0029] The preferred embodiment encodes the semantics via a frame data
structure 164. The frame data structure 164 contains empty slots 166
which are filled when the semantic interpretation of global parser 162
matches the frame. For example, a frame data structure (whose domain is
tuner commands) includes an empty slot for specifying the
viewer-requested channel for a time period. If viewer 120 has provided
the channel, then that empty slot is filled with that information.
However, if that particular frame needs to be filled after the viewer
has initially provided its request, then dialog manager 130 instructs
computer response module 134 to ask viewer 120 to provide a desired
channel.
[0030] The frame data structure 164 preferably includes multiple frames
which each in turn have multiple slots. One frame may have slots
directed to attributes of a movie, director, and type of movie. Another
frame may have slots directed to attributes associated with the time in
which the movie is playing, the channel, and so forth.
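
A frame with empty slots can be modelled very simply. The Python sketch
below is one possible representation, not the structure used in the
source: the class name Frame, the slot names and the wording of the
prompt are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """A named frame whose slots start empty (None) and are filled later."""
    domain: str
    slots: dict = field(default_factory=dict)

    def fill(self, interpretation):
        # Copy only the values that correspond to slots of this frame.
        for name, value in interpretation.items():
            if name in self.slots:
                self.slots[name] = value

    def empty_slots(self):
        return [name for name, value in self.slots.items() if value is None]

def next_prompt(frame):
    """Return a question for the first empty slot, or None if complete."""
    missing = frame.empty_slots()
    return f"Which {missing[0]} would you like?" if missing else None

tuner = Frame("tuner commands", {"channel": None, "time period": None})
tuner.fill({"time period": "8 pm to 9 pm", "volume": "loud"})
prompt = next_prompt(tuner)
```

Here the viewer supplied a time period but no channel, so the sketch
produces a question for the channel slot, mirroring the behaviour
attributed to dialog manager 130 and computer response module 134.
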

[0031] The following reference discusses global parsers and frames: R.
Kuhn and R. D. Mori, Spoken Dialogues with Computers (Chapter 14:
Sentence Interpretation), Academic Press, Boston (1998).
[0032] Dialog manager 130 uses dialog history data file 167 to assist in
filling in empty slots before asking the speaker for the information.
Dialog history data file 167 contains a log of the conversation which
has occurred through the device of the present invention. For example,
if a speaker utters "I'd like to watch another Marilyn Monroe movie,"
the dialog manager 130 examines the dialog history data file 167 to
check what movies the user has already viewed or rejected in a previous
dialog exchange. If the speaker had previously rejected "Some Like It
Hot", then the dialog manager 130 fills the empty slot of the movie
title with movies of a different title. If a sufficient number of slots
have been filled, then the present invention will ask the speaker to
verify and confirm the program selection. Thus, if any assumptions made
by the dialog manager 130 through the use of dialog history data file
167 prove to be incorrect, then the speaker can correct the assumption.
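
The use of the dialog history to fill the title slot might look like the
following Python sketch. The log format (a list of records with a title
and an outcome) and the candidate list are assumptions made for the
example; the source does not describe the layout of the history file.

```python
def propose_title(candidates, history):
    """Pick the first candidate not already viewed or rejected."""
    excluded = {entry["title"] for entry in history
                if entry["outcome"] in ("viewed", "rejected")}
    for title in candidates:
        if title not in excluded:
            return title
    return None  # nothing left to propose; the speaker must be asked

# Hypothetical log of an earlier exchange and a candidate list of titles.
history = [{"title": "Some Like It Hot", "outcome": "rejected"}]
candidates = ["Some Like It Hot", "The Seven Year Itch", "Niagara"]
choice = propose_title(candidates, history)
```

The proposed title is still an assumption on the system's part, which is
why the speaker is asked to verify and confirm the selection afterwards.
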
[0033] The natural language parser 12 analyzes and extracts semantically
important and meaningful topics from a loosely structured, natural
language text which may have been generated as the output of an
automatic speech recognition system (ASR) used by a dialogue or speech
understanding system. The natural language parser 12 translates the
natural language text input to a new representation by generating
well-structured tags containing topic information and data, and
associating each tag with the segments of the input text containing the
tagged information. In addition, tags may be generated in other forms
such as a separate list, or as a semantic frame.

[0034] Robustness is a feature of the natural language parser 12 as the
input can contain grammatically incorrect English sentences, due to the
following reasons: the input to the recognizer is casual, dialog-style
natural speech, which can contain broken sentences and partial phrases,
and the speech recognizer can introduce insertion, omission, or
mis-recognition errors even when the speech input is considered correct.
The natural language parser 12 deals robustly with all types of input
and extracts as much information as possible.
[0035] Figure 3 depicts the different components of the local parser 160
of the natural language parser 12. The natural language parser 12
preferably utilizes generalized parsing techniques in a multi-pass
approach as a fixed-point computation. Each topic is described as a
context-sensitive LR (left-right and rightmost derivation) grammar,
allowing ambiguities. The following are references related to
context-sensitive LR grammars: A. Aho and J. D. Ullman, Principles of
Compiler Design, Addison Wesley Publishing Co., Reading, Massachusetts
(1977); and N. Tomita, Generalized LR Parsing, Kluwer Academic
Publishers, Boston, Massachusetts (1991).
[0036] At each pass of the computation, a generalized parsing algorithm
is used to generate preferably all possible (both complete and partial)
parse trees independently for each targeted topic. Each pass potentially
generates several alternative parse-trees, each parse-tree representing
a possibly different interpretation of a particular topic. The multiple
passes through preferably parallel and independent paths result in a
substantial elimination of ambiguities and overlap among different
topics. The generalized parsing algorithm is a systematic way of scoring
all possible parse-trees so that the (N) best candidates are selected
utilizing the contextual information present in the system.
[0037] Local parsing system 160 is carried out in three stages: lexical
analysis 220; parallel parse-forest generation for each topic (for
example, generators 230 and 232); and analysis and synthesis of parsed
components as shown generally by reference numeral 234.

Lexical analysis:

[0038] A speaker utters a phrase that is recognized by an automatic
speech recognizer 217 which generates input sentence 218. Lexical
analysis stage 220 identifies and generates tags for the topics (which
do not require extensive grammars) in input sentence 218 using lexical
filters 226 and 228. These include, for example, movie names; category
of movie; producers; names of actors and actresses; and so forth. A
regular-expression scan of the input sentence 218 using the keywords
involved in the mentioned exemplary tags is typically sufficient at this
level. Also performed at this stage is the tagging of words in the input
sentence that are not part of the lexicon of the particular grammar.
These words are indicated using an X-tag so that such noise words are
replaced with the letter "X".
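
A minimal Python sketch of this stage is given below. The two filters,
the grammar lexicon and the tag syntax are hypothetical, and multi-word
names are joined with an underscore purely to keep the tokenizer simple.

```python
import re

# Hypothetical keyword filters and grammar lexicon.
LEXICAL_FILTERS = {
    "CATEGORY": re.compile(r"^(comedy|drama|news|sports)$"),
    "ACTOR": re.compile(r"^(mel_brooks|marilyn_monroe)$"),
}
GRAMMAR_LEXICON = {"before", "after", "on", "channel", "january", "23rd"}

def lexical_analysis(sentence):
    """Tag keyword matches, keep lexicon words, replace the rest with X."""
    # Join known multi-word names so that each becomes a single token.
    text = sentence.lower().replace("mel brooks", "mel_brooks")
    text = text.replace("marilyn monroe", "marilyn_monroe")
    output = []
    for word in text.split():
        tag = next((name for name, pattern in LEXICAL_FILTERS.items()
                    if pattern.match(word)), None)
        if tag:
            output.append(f"<{tag}:{word}>")
        elif word in GRAMMAR_LEXICON:
            output.append(word)
        else:
            output.append("X")  # noise word outside the grammar's lexicon
    return output

tokens = lexical_analysis(
    "tell me about a comedy with Mel Brooks before January 23rd")
```

Words such as "tell" and "me" become X, while "before", "january" and
"23rd" survive untouched for the topic grammars of the next stage.
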

Parallel parse-forest generation:

[0039] The parser 12 uses a high-level general parsing strategy to
describe and parse each topic separately, and generates tags and maps
them to the input stream. Due to the nature of unstructured input text
218, each individual topic parser preferably accepts as large a language
as possible, ignoring all but important words, dealing with insertion
and deletion errors. The parsing of each topic involves designing
context-sensitive grammar rules using a meta-level specification
language, much like the ones used in LR parsing. Examples of grammars
include grammar A 240 and grammar B 242. Using the present invention's
approach, topic grammars 240 and 242 are described as if they were an
LR-type grammar, containing redundancies and without eliminating shift
and reduce conflicts. The result of parsing an input sentence is all
possible parses based on the grammar specifications.

[0040] Generators 230 and 232 generate parse forests 250 and 252 for
their topics. Tag-generation is done by synthesizing actual information
found in the parse tree obtained during parsing. Tag generation is
accomplished via tag and score generators 260 and 262 which respectively
generate tags 264 and 266. Each identified tag also carries information
about what set of input words in the input sentence are covered by the
tag. Subsequently the tag replaces its cover-set. In the preferred
embodiment, context information 267 is utilized for tag and score
generations, such as by generators 260 and 262. Context information 267
is utilized in the scoring heuristics for adjusting weights associated
with a heuristic scoring factor technique that is discussed below.
Context information 267 preferably includes word confidence vector 268
and dialogue context weights 269. However, it should be understood that
the parser 12 is not limited to using both word confidence vector 268
and dialogue context weights 269, but also includes using one to the
exclusion of the other, as well as not utilizing context information
267.

[0041] Automatic speech recognition process block 217 generates word
confidence vector 268 which indicates how well the words in input
sentence 218 were recognized. Dialog manager 130 generates dialogue
context weights 269 by determining the state of the dialogue. For
example, dialog manager 130 asks a user about a particular topic, such
as, what viewing time is preferable. Due to this request, dialog manager
130 determines that the state of the dialogue is time-oriented. Dialog
manager 130 provides dialogue context weights 269 in order to inform the
proper processes to more heavily weight the detected time-oriented
words.

Synthesis of Tag-components:

[0042] The topic spotting parser of the previous stage generates a
significant amount of information that needs to be analyzed and combined
together to form the final output of the local parser. The parser 12 is
preferably as "aggressive" as possible in spotting each topic resulting
in the generation of multiple tag candidates. Additionally in the
presence of numbers or certain key-words, such as "between", "before",
"and", "or", "around", etc., and especially if these words have been
introduced or dropped due to recognition errors it is possible to
construct many alternative tag candidates. For example, an input
sentence could have insertion or deletion errors. The combining phase
determines which tags form a more meaningful interpretation of the
input. The parser 12 defines heuristics and makes a selection based on
them using an N-Best candidate selection process. Each generated tag
corresponds to a set of words in the input word string, called the tag's
cover-set.
[0043] A heuristic is used that takes into account the cover-sets of the
tags used to generate a score. The score roughly depends on the size of
the cover-set, the sizes in the number of the words of the gaps within
the covered items, and the weights assigned to the presence of certain
keywords. In the preferred embodiment, ASR-derived confidence vector and
dialog context information are utilized to assign priorities to the
tags. For example applying channel-tags parsing first potentially
removes channel-related numbers that are easier to identify uniquely
from the input stream, and leaves fewer numbers to create ambiguities
with other tags. Preferably, dialog context information is used to
adjust the priorities.
N-Best Candidates Selection 

[0044] At the end of each pass, an N-best processor 270 selects the
N-best candidates based upon the scores associated with the tags and
generates the topic-tags, each representing the information found in the
corresponding parse-tree. Once topics have been discovered this way, the
corresponding words in the input can be substituted with the tag
information. This substitution transformation eliminates the
corresponding words from the current input text. The output 280 of each
pass is fed back to the next pass as the new input, since the
substitutions may help in the elimination of certain ambiguities among
competing grammars or help generate better parse-trees by filtering out
overlapping symbols.
[0045] Computation ceases when no additional tags are generated in the
last pass. The output of the final pass becomes the output of the local
parser to global parser 162. Since each phase can only reduce the number
of words in its input and the length of the input text is finite, the
number of passes in the fixed-point computation is linearly bounded by
the size of its input.
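
The fixed-point character of the computation can be seen in a short
Python sketch. The two regular-expression rules are hypothetical
stand-ins for topic grammars, and one substitution per pass is an
assumption made to keep the loop easy to follow.

```python
import re

# Hypothetical topic spotters, ordered by priority (channel first).
TOPIC_RULES = [
    ("CHANNEL", re.compile(r"\bchannel \d+\b")),
    ("TIME", re.compile(r"\b(at|around|before) \d+\b")),
]

def run_pass(text):
    """Substitute the first matching topic with its tag, if any."""
    for name, pattern in TOPIC_RULES:
        text, count = pattern.subn(f"<{name}>", text, count=1)
        if count:
            return text, True   # one substitution per pass
    return text, False

def local_parse(text):
    """Repeat passes until a pass generates no additional tags."""
    passes = 0
    changed = True
    while changed:
        text, changed = run_pass(text)
        passes += 1
    return text, passes

sentence = "show me channel 5 at 7"
output, passes = local_parse(sentence)
```

Each successful pass replaces at least two words with a single tag, so
the loop cannot run for more passes than there are words in the input,
plus the final pass that detects no change.
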
[0046] The following scoring factors are used to rank the alternative
parse trees based on the following attributes of a parse-tree:

• Number of terminal symbols.

• Number of non-terminal symbols.

• The depth of the parse-tree.

• The size of the gaps in the terminal symbols.

• ASR-Confidence measures associated with each terminal symbol.

• Context-adjustable weights associated with each terminal and
non-terminal symbol.
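
The factors listed above could be combined into a single ranking score
in many ways. The Python sketch below uses a plain weighted sum, which
is an assumption: the source names the attributes but gives neither
coefficients nor a combining formula, and the sample trees are invented
for illustration.

```python
from dataclasses import dataclass

@dataclass
class ParseTree:
    label: str
    terminals: int         # number of terminal symbols
    non_terminals: int     # number of non-terminal symbols
    depth: int             # depth of the parse-tree
    gap_size: int          # size of the gaps in the terminal symbols
    asr_confidence: float  # mean ASR confidence of the terminals, 0..1
    context_weight: float  # context-adjustable weight

# Hypothetical coefficients, one per scoring factor.
COEFFICIENTS = {"terminals": 1.0, "non_terminals": 0.5, "depth": -0.25,
                "gap_size": -1.0, "asr_confidence": 2.0, "context_weight": 1.0}

def score(tree):
    """Weighted sum of the parse-tree attributes."""
    return sum(coefficient * getattr(tree, name)
               for name, coefficient in COEFFICIENTS.items())

def n_best(trees, n):
    """Return the n highest-scoring parse trees."""
    return sorted(trees, key=score, reverse=True)[:n]

trees = [
    ParseTree("time: at 7", 2, 1, 2, 0, 0.75, 1.0),
    ParseTree("channel: 7", 1, 1, 2, 0, 0.75, 0.0),
    ParseTree("time: 5 ... 7", 2, 1, 2, 1, 0.5, 1.0),
]
best = n_best(trees, 2)
```

In a time-oriented dialogue state the context weight favours the two
time readings, and the gap penalty then separates them.
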

[0047] Each path preferably corresponds to a separate topic that can be
developed independently, operating on a small amount of data, in a
computationally inexpensive way. The architecture of the natural
language parser 12 is flexible and modular so incorporating additional
paths and grammars, for new topics, or changing heuristics for
particular topics is straightforward. This also allows developing
reusable components that can be shared among different systems easily.
[0048] From the foregoing it will be appreciated that the present
invention is well adapted to providing useful information obtained from
the internet to the TV viewer. The speech-enabled, natural language
interface creates a user friendly, easy to use system that can greatly
enhance the enjoyment and usefulness of both television and the
internet. The result of using the system is a natural blend of passive
television viewing and interactive internet information retrieval.
[0049] While the invention has been described in its presently preferred
embodiment, it will be understood that the invention is capable of
modification without departing from the spirit of the invention as set
forth in the appended claims.

Claims

1. A system for accessing supplemental network-resident information
about an audio/video program comprising:

a network connection through which network-resident information may be
obtained;
a speech recognizer receptive of a user's input speech request for
information about a program;
a natural language parser coupled to said speech recognizer for
extracting a semantic representation of the user's request for
information;
a data store for storing a representation of an electronic program
guide;
a search engine commander coupled to said parser for issuing at least
one search request to at least one search engine through said network
connection based on said semantic representation and based on said
electronic program guide; and
a search results processor for receiving search results in response to
said search request and for providing at least a portion of the received
search results to the user as information about an audio/video program.

2. The system of claim 1 further comprising speech synthesizer coupled
to said search results processor for providing the user with synthesized
speech information about an audio/video program.

3. The system of claim 1 wherein said network connection provides
connection to the internet.

4. The system of claim 1 wherein said search engine accesses at least
one predefined site containing information about predefined topics
pertaining to an audio/video program.

5. The system of claim 1 wherein said search engine commander includes a
user profile data store for storing historical data about prior requests
by the user for information.

6. The system of claim 1 wherein said search engine commander includes a
mechanism for updating the contents of said electronic program guide
data store.

7. The system of claim 1 wherein said natural language parser includes a
set of predefined goal-oriented grammars.

8. The system of claim 1 wherein said natural language parser includes a
data store for storing a set of grammars that are downloaded through
said network connection.